RLWS: A Reinforcement Learning based GPU Warp Scheduler
Authors
Abstract
The Streaming Multiprocessors (SMs) of a Graphics Processing Unit (GPU) execute instructions from groups of consecutive threads, called warps. At each cycle, an SM schedules a warp from among its active warps and can context switch among them to hide various stalls. The performance of the warp scheduler is therefore critical to the performance of the GPU. Several heuristic warp scheduling algorithms have been proposed, but each works well only for the situations it was designed for. GPU workloads are becoming increasingly diverse, so a single heuristic may not work well in all cases. To perform well over a diverse range of workloads, which may exhibit hitherto unseen characteristics, a warp scheduling algorithm must be able to adapt on-line. We propose a Reinforcement Learning based Warp Scheduler (RLWS) which learns to schedule warps based on the current state of the core and the long-term benefits of scheduling actions, adapting not only to different types of workloads but also to different execution phases within each workload. Because the design space of state variables and parameters (such as learning and exploration rates, and reward and penalty values) used by RLWS is large, we use a Genetic Algorithm to identify the useful subset of state variables and parameter values. We evaluated the proposed RLWS using the GPGPU-SIM simulator on a large number of workloads from the Rodinia, Parboil, CUDA-SDK and GPGPU-SIM benchmark suites and compared it with other state-of-the-art warp scheduling methods. Our RL based implementation achieved either the best or very close to the best performance in 80% of kernels, with an average speedup of 1.06x over the Loose Round Robin strategy and 1.07x over the Two-Level strategy.
Keywords: GPU; Warp Scheduling; Divergence
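The abstract does not include implementation details, but the core idea of learning to pick a warp from the current core state and a delayed reward signal can be illustrated with a small tabular Q-learning sketch in Python. Everything below is an assumption made for illustration: the state features, the reward placeholder, the ready_warps interface and the RLWarpScheduler class are hypothetical and not taken from the paper.

import random
from collections import defaultdict

class RLWarpScheduler:
    """Illustrative tabular Q-learning warp selector (not the paper's actual design)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.05):
        self.alpha = alpha            # learning rate (RLWS tunes such parameters via GA)
        self.gamma = gamma            # discount factor: weight of long-term benefit
        self.epsilon = epsilon        # exploration rate
        self.q = defaultdict(float)   # Q-values keyed by (state, warp_id)

    def select_warp(self, state, ready_warps):
        """Epsilon-greedy choice among the warps ready to issue this cycle."""
        if random.random() < self.epsilon:
            return random.choice(ready_warps)
        return max(ready_warps, key=lambda w: self.q[(state, w)])

    def update(self, state, warp, reward, next_state, next_ready):
        """One-step Q-learning update after observing the scheduling outcome."""
        best_next = max((self.q[(next_state, w)] for w in next_ready), default=0.0)
        target = reward + self.gamma * best_next
        self.q[(state, warp)] += self.alpha * (target - self.q[(state, warp)])

# Toy driver: a fake per-cycle loop with made-up state features and rewards.
if __name__ == "__main__":
    sched = RLWarpScheduler()
    state = ("low_occupancy", "no_pending_mem")   # hypothetical state encoding
    for cycle in range(1000):
        ready = [0, 1, 2, 3]                      # warp ids ready this cycle
        warp = sched.select_warp(state, ready)
        reward = 1.0 if warp % 2 == 0 else -0.1   # placeholder reward/penalty
        next_state = state                        # a real SM would recompute its features
        sched.update(state, warp, reward, next_state, ready)
        state = next_state

In the paper the useful state variables and the parameter values (learning rate, exploration rate, reward and penalty) are selected with a Genetic Algorithm; the constructor arguments above only mark where such tuned values would plug in.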
Similar Papers
GA3C: GPU-based A3C for Deep Reinforcement Learning
We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for ot...
Reinforcement Learning Scheduler for Heterogeneous Multi-core Processors (a thesis approved for the School of Computer Science, University of Oklahoma Graduate College)
متن کاملScheduling Straight-Line Code Using Reinforcement Learning and Rollouts
The execution order of a block of computer instructions can make a difference in its running time by a factor of two or more. In order to achieve the best possible speed, compilers use heuristic schedulers appropriate to each specific architecture implementation. However, these heuristic schedulers are time-consuming and expensive to build. In this paper, we present results using both rollouts ...
Analyzing the Tensile Behavior of Warp-Knitted Fabric Reinforced Composites Part I: Modeling the Geometry of Reinforcement
Basic-block Instruction Scheduling Using Reinforcement Learning and Rollouts
The execution order of a block of computer instructions on a pipelined machine can make a difference in its running time by a factor of two or more. In order to achieve the best possible speed, compilers use heuristic schedulers appropriate to each specific architecture implementation. However, these heuristic schedulers are time-consuming and expensive to build. We present empirical results us...
Journal: CoRR
Volume: abs/1712.04303
Publication year: 2017